65 research outputs found

    Bridge the Gap Between CV and NLP! An Optimization-based Textual Adversarial Attack Framework

    Despite recent success on various tasks, deep learning techniques still perform poorly on adversarial examples with small perturbations. While optimization-based methods for adversarial attacks are well-explored in the field of computer vision, it is impractical to directly apply them in natural language processing due to the discrete nature of the text. To address the problem, we propose a unified framework to extend the existing optimization-based adversarial attack methods in the vision domain to craft textual adversarial samples. In this framework, continuously optimized perturbations are added to the embedding layer and amplified in the forward propagation process. Then the final perturbed latent representations are decoded with a masked language model head to obtain potential adversarial samples. In this paper, we instantiate our framework with an attack algorithm named Textual Projected Gradient Descent (T-PGD). We find our algorithm effective even using proxy gradient information. Therefore, we perform the more challenging transfer black-box attack and conduct comprehensive experiments to evaluate our attack algorithm with several models on three benchmark datasets. Experimental results demonstrate that our method achieves an overall better performance and produces more fluent and grammatical adversarial samples compared to strong baseline methods. All the code and data will be made public. Comment: Codes are available at: https://github.com/Phantivia/T-PG

    Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations

    This paper reexamines the research on out-of-distribution (OOD) robustness in the field of NLP. We find that the distribution shift settings in previous studies commonly lack adequate challenges, hindering the accurate evaluation of OOD robustness. To address these issues, we propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts. Then we introduce BOSS, a Benchmark suite for Out-of-distribution robustneSS evaluation covering 5 tasks and 20 datasets. Based on BOSS, we conduct a series of experiments on pre-trained language models for analysis and evaluation of OOD robustness. First, for vanilla fine-tuning, we examine the relationship between in-distribution (ID) and OOD performance. We identify three typical types that unveil the inner learning mechanism, which could potentially facilitate the forecasting of OOD robustness, correlating with the advancements on ID datasets. Then, we evaluate 5 classic methods on BOSS and find that, despite exhibiting some effectiveness in specific cases, they do not offer significant improvement compared to vanilla fine-tuning. Further, we evaluate 5 LLMs with various adaptation paradigms and find that when sufficient ID data is available, fine-tuning domain-specific models outperform LLMs on ID examples significantly. However, in the case of OOD instances, prioritizing LLMs with in-context learning yields better results. We identify that both fine-tuned small models and LLMs face challenges in effectively addressing downstream tasks. The code is public at https://github.com/lifan-yuan/OOD_NLP. Comment: Accepted to NeurIPS 2023 Dataset and Benchmark Track. Code is available at https://github.com/lifan-yuan/OOD_NLP

    From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework

    Textual adversarial attacks can discover models' weaknesses by adding semantic-preserved but misleading perturbations to the inputs. The long-lasting adversarial attack-and-defense arms race in Natural Language Processing (NLP) is algorithm-centric, providing valuable techniques for automatic robustness evaluation. However, the existing practice of robustness evaluation may exhibit issues of incomprehensive evaluation, impractical evaluation protocol, and invalid adversarial samples. In this paper, we aim to set up a unified automatic robustness evaluation framework, shifting towards model-centric evaluation to further exploit the advantages of adversarial attacks. To address the above challenges, we first determine robustness evaluation dimensions based on model capabilities and specify the reasonable algorithm to generate adversarial samples for each dimension. Then we establish the evaluation protocol, including evaluation settings and metrics, under realistic demands. Finally, we use the perturbation degree of adversarial samples to control the sample validity. We implement a toolkit RobTest that realizes our automatic robustness evaluation framework. In our experiments, we conduct a robustness evaluation of RoBERTa models to demonstrate the effectiveness of our evaluation framework, and further show the rationality of each component in the framework. The code will be made public at https://github.com/thunlp/RobTest. Comment: Accepted to Findings of ACL 202

    Photometry of Variable Stars from Dome A, Antarctica

    Dome A on the Antarctic plateau is likely one of the best observing sites on Earth thanks to the excellent atmospheric conditions present at the site during the long polar winter night. We present high-cadence time-series aperture photometry of 10,000 stars with i < 14.5 mag located in a 23 square-degree region centered on the south celestial pole. The photometry was obtained with one of the CSTAR telescopes during 128 days of the 2008 Antarctic winter. We used this photometric data set to derive site statistics for Dome A and to search for variable stars. Thanks to the nearly-uninterrupted synoptic coverage, we find 6 times as many variables as previous surveys with similar magnitude limits. We detected 157 variable stars, of which 55% are unclassified, 27% are likely binaries and 17% are likely pulsating stars. The latter category includes delta Scuti, gamma Doradus and RR Lyrae variables. One variable may be a transiting exoplanet. Comment: Accepted for publication in the Astronomical Journal. PDF version with high-resolution figures available at http://faculty.physics.tamu.edu/lmacri/papers/wang11.pd

    Photometric Variability in the CSTAR Field: Results From the 2008 Data Set

    The Chinese Small Telescope ARray (CSTAR) is the first telescope facility built at Dome A, Antarctica. During the 2008 observing season, the installation provided long-baseline and high-cadence photometric observations in the i-band for 18,145 targets within the 20 deg^2 CSTAR field around the South Celestial Pole for the purpose of monitoring the astronomical observing quality of Dome A and detecting various types of photometric variability. Using sensitive and robust detection methods, we discover 274 potential variables from this data set, 83 of which are new discoveries. We characterize most of them, providing the periods, amplitudes and classes of variability. The catalog of all these variables is presented along with the discussion of their statistical properties. Comment: 38 pages, 11 figures, 4 tables; Accepted for publication in ApJ

    The First Release of the CSTAR Point Source Catalog from Dome A, Antarctica

    In 2008 January the 24th Chinese expedition team successfully deployed the Chinese Small Telescope ARray (CSTAR) to Dome A, the highest point on the Antarctic plateau. CSTAR consists of four 14.5 cm optical telescopes, each with a different filter (g, r, i and open) and has a 4.5 degree x 4.5 degree field of view (FOV). It operates robotically as part of the Plateau Observatory, PLATO, with each telescope taking an image every 30 seconds throughout the year whenever it is dark. During 2008, CSTAR #1 performed almost flawlessly, acquiring more than 0.3 million i-band images for a total integration time of 1728 hours during 158 days of observations. For each image taken under good sky conditions, more than 10,000 sources down to 16 mag could be detected. We performed aperture photometry on all the sources in the field to create the catalog described herein. Since CSTAR has a fixed pointing centered on the South Celestial Pole (Dec = -90 degrees), all the sources within the FOV of CSTAR were monitored continuously for several months. The photometric catalog can be used for studying any variability in these sources, and for the discovery of transient sources such as supernovae, gamma-ray bursts and minor planets. Comment: 1 latex file and 9 figures. The paper is accepted by PAS

    Eclipsing Binaries From the CSTAR Project at Dome A, Antarctica

    The Chinese Small Telescope ARray (CSTAR) has observed an area around the Celestial South Pole at Dome A since 2008. About 20,000 light curves in the i band were obtained lasting from March to July, 2008. The photometric precision achieves about 4 mmag at i = 7.5 and 20 mmag at i = 12 within a 30 s exposure time. These light curves are analyzed using Lomb-Scargle, Phase Dispersion Minimization, and Box Least Squares methods to search for periodic signals. False positives may appear as a variable signature caused by contaminating stars and the observation mode of CSTAR. Therefore the period and position of each variable candidate are checked to eliminate false positives. Eclipsing binaries are removed by visual inspection, frequency spectrum analysis and locally linear embedding technique. We identify 53 eclipsing binaries in the field of view of CSTAR, containing 24 detached binaries, 8 semi-detached binaries, 18 contact binaries, and 3 ellipsoidal variables. To derive the parameters of these binaries, we use the Eclipsing Binaries via Artificial Intelligence (EBAI) method. The primary and the secondary eclipse timing variations (ETVs) for semi-detached and contact systems are analyzed. Correlated primary and secondary ETVs confirmed by false alarm tests may indicate an unseen perturbing companion. Through ETV analysis, we identify two triple systems (CSTAR J084612.64-883342.9 and CSTAR J220502.55-895206.7). The orbital parameters of the third body in CSTAR J220502.55-895206.7 are derived using a simple dynamical model. Comment: 41 pages, 12 figures; published online in ApJ

    The sky brightness and transparency in i-band at Dome A, Antarctica

    The i-band observing conditions at Dome A on the Antarctic plateau have been investigated using data acquired during 2008 with the Chinese Small Telescope ARray. The sky brightness, variations in atmospheric transparency, cloud cover, and the presence of aurorae are obtained from these images. The median sky brightness of moonless clear nights is 20.5 mag arcsec^{-2} in the SDSS i band at the South Celestial Pole (which includes a contribution of about 0.06 mag from diffuse Galactic light). The median over all Moon phases in the Antarctic winter is about 19.8 mag arcsec^{-2}. There were no thick clouds in 2008. We model contributions of the Sun and the Moon to the sky background to obtain the relationship between the sky brightness and transparency. Aurorae are identified by comparing the observed sky brightness to the sky brightness expected from this model. About 2% of the images are affected by relatively strong aurorae. Comment: There are 1 Latex file and 14 figures accepted by A